Abstract - High speed and low power consumption are among the
most important design objectives in integrated circuits. Digital
multipliers are among the most critical functional units, and the
overall performance of a system depends on the throughput of its
multiplier. Transistor aging has a significant effect on the
performance of such systems, and in the long term the system may
fail due to timing violations. The aging effect can be reduced by
over-design approaches, but these lead to area and power
inefficiency. Hence, to reduce power consumption and delay, a
variable-latency multiplier with adaptive hold logic (AHL) is
used. The multiplier provides higher throughput through variable
latency and can adjust the AHL circuit to mitigate the
performance degradation caused by the aging effect. The proposed
architecture can be applied to image multiplication. Timing
violations are reduced by combining Razor flip-flops with
adaptive hold logic. A fixed-latency design uses more clock
cycles, whereas variable latency reduces the number of
re-executed cycles.


Index Terms - Adaptive hold logic; Negative bias
temperature instability; Variable latency; Fixed latency

I. INTRODUCTION 

Multiplication is an essential arithmetic operation for 
common DSP applications, such as filtering and fast Fourier 
transform (FFT). To achieve high execution speed, parallel 
array multipliers are widely used. These multipliers tend to 
consume most of the power in DSP computations, and thus 
power-efficient multipliers are very important for the design 
of low-power DSP systems. If the multipliers are too slow, the 
performance of entire circuits will be reduced. 

Traditional circuits use the critical path delay as the overall
circuit clock cycle in order to operate correctly. However, most
paths are shorter than the critical path, so part of each cycle
is wasted. The variable-latency design was proposed to reduce
this timing waste. It divides the circuit into two parts:
shorter paths and longer paths. Shorter paths can execute
correctly in one cycle, whereas longer paths need two cycles.
When shorter paths are activated frequently, the average latency
of variable-latency designs is better than that of traditional
designs. In this paper, we present a low-power column-bypassing
and row-bypassing multiplier design methodology that exploits
the zeros in the multiplicand and multiplier to reduce both delay
and power consumption. The reduction in delay and power depends
on the input bit coefficients: if an input bit coefficient is
zero, the corresponding row or column of adders need not be
activated.

Negative bias temperature instability (NBTI) occurs when a pMOS
transistor is under negative bias and results in an aging effect.
Aging degrades transistor speed by increasing the threshold
voltage, which leads to delay problems during operation. The
corresponding effect on an nMOS transistor is positive bias
temperature instability (PBTI), which occurs when an nMOS
transistor is under positive bias. Traditional methods of
reducing this aging effect are area and power inefficient.

In a variable-latency design, the shortest paths are assigned to
execute within one cycle and the longest paths within two or
more cycles. When shorter paths are activated frequently, the
average latency of variable-latency designs is better than that
of fixed-latency designs. The main objective of this work is to
multiply two images using a low-power variable-latency
multiplier with adaptive hold logic (AHL). The design is
synthesized using the Xilinx ISE Design Suite and simulated using
the ModelSim simulator. The major applications of these digital
multipliers are in digital filtering and signal processing.

II. IMPACT OF NBTI ON THE PERFORMANCE 
DEGRADATION OF DIGITAL CIRCUITS 

NBTI is a major threat to the lifetime reliability of
integrated circuits. With the continuous scaling of transistor
dimensions, the reliability degradation of circuits has become 
an important issue. Due to an increasing electric field across 
the thin oxide, the generation of interface traps under negative 
bias temperature instability (NBTI) in pMOS transistors has 
become one of the most critical reliability issues that 
determine the lifetime of CMOS devices. Due to NBTI, the 
threshold voltage of the transistor increases with time 
resulting in the reduction in drive current, which in turn 
results in temporal performance degradation of circuits. 

Vth Degradation Model: NBTI is the result of trap
generation at the Si/SiO2 interface in negatively biased pMOS
transistors at elevated temperatures. The interaction of
inversion-layer holes with hydrogen-passivated Si atoms can
break the Si-H bonds, creating an interface trap and one H
atom that can diffuse away from the interface or anneal an
existing trap.

Gate Delay Degradation Model: The delay of a gate depends on
its threshold voltage. Therefore, by monitoring the
threshold voltage degradation, the change in gate delay can be
estimated with a high degree of accuracy.

From the above analysis, it is clear that circuit delay
depends on threshold voltage variation, and hence
performance degradation will occur. Techniques used
to mitigate this effect include Vdd tuning, pMOS sizing, gate
length tuning, and switching frequency tuning. These
techniques can reduce the impact of NBTI to a great extent, but
area and power inefficiency remains a major problem.
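
The dependence of gate delay on threshold voltage can be illustrated with a small numerical sketch. It assumes the widely used alpha-power law, in which delay is proportional to Vdd / (Vdd - Vth)^alpha; this model, the function names and all numeric values are illustrative assumptions and are not taken from the analysis in this paper.

```python
# Sketch: relative gate delay versus an NBTI-induced threshold-voltage shift.
# Assumption: alpha-power law, delay ~ Vdd / (Vdd - Vth)**alpha.
# All numbers are illustrative, not measured data.

def gate_delay(vdd, vth, alpha=1.3, k=1.0):
    """Return the gate delay in arbitrary units for supply vdd and threshold vth."""
    if vth >= vdd:
        raise ValueError("threshold voltage must stay below the supply voltage")
    return k * vdd / (vdd - vth) ** alpha


def delay_increase(vdd, vth0, delta_vth, alpha=1.3):
    """Return the fractional delay increase when the threshold rises by delta_vth."""
    fresh = gate_delay(vdd, vth0, alpha)
    aged = gate_delay(vdd, vth0 + delta_vth, alpha)
    return aged / fresh - 1.0


if __name__ == "__main__":
    for shift_mv in (0, 10, 30, 50):
        inc = delay_increase(vdd=1.0, vth0=0.3, delta_vth=shift_mv / 1000.0)
        print(f"dVth = {shift_mv:2d} mV -> delay +{inc * 100:.1f}%")
```

Under this assumed model, any increase in threshold voltage lengthens the gate delay, which is why an aged circuit can start to violate a cycle period that was safe when the circuit was fresh.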

III. BYPASSING TECHNIQUE 

Dynamic power consumption can be reduced by the
bypassing method when the multiplier has more zeros in its input
data. To perform isolation, transmission gates can be used as
ideal switches with small power consumption, small area, and a
propagation delay similar to that of an inverter. To study the
proposed design, we consider the column-bypassing
multiplier, in which columns of adders are bypassed. In this
multiplier, the operations in a column can be disabled if the
corresponding bit in the multiplicand is 0. The advantage of
this multiplier is that it eliminates the extra correcting circuit.

Bypassing multipliers are modifications of the normal array
multiplier. The path delay for an operation is strongly tied to
the number of zeros in the multiplicand in the column-bypassing
multiplier. The column-bypassing multiplier (CBM) needs only
two tristate gates and one multiplexer in an adder cell.
Traditional filter designs using bypassing multipliers do not
consider the variable-latency technique, and no digital FIR
filter using a variable-latency multiplier that considers
the aging effect and can adjust dynamically has yet been
developed.

IV. METHODOLOGY 
A) ARRAY MULTIPLIER 

The array multiplier is a fast parallel multiplier and is
shown in Fig. 1. It consists of (n-1) rows of carry-save
adders, in which each row contains (n-1) full adders. Each full
adder in the carry-save adder array has two outputs: the sum bit
goes down, and the carry bit goes to the lower-left
full adder. The last row is a ripple adder for carry propagation.
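
The structure described above can be sketched as a bit-level behavioural model. The indexing convention (row j handles multiplier bit j, column i handles multiplicand bit i) and the function names are assumptions made for illustration; the sketch follows the (n-1) x (n-1) carry-save array with a final ripple adder, but it is not a gate-exact copy of Fig. 1.

```python
def full_adder(x, y, z):
    """Return (sum, carry) for three input bits."""
    return x ^ y ^ z, (x & y) | (x & z) | (y & z)


def array_multiply(a, b, n):
    """Multiply two unsigned n-bit integers with a carry-save adder array."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> j) & 1 for j in range(n)]
    # Partial products: pp[j][i] = a_i AND b_j, with weight 2**(i + j).
    pp = [[a_bits[i] & b_bits[j] for i in range(n)] for j in range(n)]

    sums = pp[0][:]           # row 0 needs no adders
    carries = [0] * (n - 1)
    product = sums[0]         # bit 0 of the product

    for j in range(1, n):     # (n-1) carry-save rows
        new_sums = [0] * n
        new_carries = [0] * (n - 1)
        for i in range(n - 1):    # (n-1) full adders per row
            new_sums[i], new_carries[i] = full_adder(
                pp[j][i], sums[i + 1], carries[i])
        new_sums[n - 1] = pp[j][n - 1]   # top partial product is passed on unchanged
        sums, carries = new_sums, new_carries
        product |= sums[0] << j          # one product bit is final after each row

    carry = 0
    for i in range(n - 1):    # last row: ripple adder for carry propagation
        bit, carry = full_adder(sums[i + 1], carries[i], carry)
        product |= bit << (n + i)
    product |= carry << (2 * n - 1)
    return product


if __name__ == "__main__":
    print(array_multiply(0b1010, 0b1111, 4))
```

The carry of each adder is not propagated within its own row but handed to the next row, so every row has a delay of one full adder; only the final ripple adder propagates carries along a row.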



B) COLUMN BYPASSING MULTIPLIER

A column-bypassing multiplier is an improvement of the
array multiplier and is shown in Fig. 2. It is a low-power
design in which the full adder operations are disabled if the
corresponding bit in the multiplicand is zero. Supposing the
inputs are 1010 * 1111, it can be seen that for the full adders
in the first and third diagonals, two of the three input bits
are 0: the carry bit from the upper-right full adder and the
partial product aibi. The sum output of such a full adder
therefore simply equals its remaining input.



Fig.2. Column bypassing multiplier 

The multiplicand bit ai can be used as the selector of the 
multiplexer to decide the output of the full adder, and ai can 
also be used as the selector of the tristate gate to turn off the 
input path of the full adder. If ai is 0, the inputs of full adder 
are disabled, and the sum bit of the current full adder is equal 
to the sum bit from its upper full adder, thus reducing the 
power consumption of the multiplier. If ai is 1, the normal 
sum result is selected. 
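
The behaviour of a single bypassing adder cell can be sketched as follows. The function names, the choice to return a carry of 0 from a bypassed cell, and the assumption that adders sit in rows 1 to n-1 and columns 0 to n-2 are illustrative modelling choices, not details confirmed by Fig. 2.

```python
def full_adder(x, y, z):
    """Return (sum, carry) for three input bits."""
    return x ^ y ^ z, (x & y) | (x & z) | (y & z)


def bypass_cell(a_i, b_j, sum_in, carry_in):
    """Behavioural model of one adder cell in a column-bypassing multiplier."""
    if a_i == 0:
        # Tristate gates isolate the adder inputs; the multiplexer forwards
        # the sum bit from the upper adder. Carry is assumed to be 0 here.
        return sum_in, 0
    # a_i == 1: normal full-adder operation on the partial product a_i AND b_j.
    return full_adder(a_i & b_j, sum_in, carry_in)


def disabled_cells(a, n):
    """List the (row, column) adder cells switched off for multiplicand a."""
    return [(row, col)
            for row in range(1, n)
            for col in range(n - 1)
            if (a >> col) & 1 == 0]


if __name__ == "__main__":
    # Multiplicand 1010: bits 0 and 2 are zero, so two columns are bypassed.
    print(disabled_cells(0b1010, 4))
```

When ai is 0, the partial product is 0 and, in this model, the carry arriving from the same column is also 0, so the bypassed cell returns exactly what an active full adder would have computed while avoiding its switching activity.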

C) VARIABLE LATENCY DESIGN

The variable-latency design was proposed to reduce the 
timing waste occurring in traditional circuits that use the 
critical path cycle as an execution cycle period. The basic 
concept is to execute a shorter path using a shorter cycle and 
longer path using two cycles. Since most paths execute in a 
cycle period that is much smaller than the critical path delay, 
the variable-latency design has smaller average latency. 
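
The latency argument above can be made concrete with a short calculation. The function name and the numbers used (a 10 ns critical path, a 6 ns shortened cycle, 80% of patterns finishing in one cycle) are hypothetical values chosen only to illustrate the idea.

```python
def average_latency(cycle_period, one_cycle_fraction, long_path_cycles=2):
    """Average latency when a fraction of patterns finish in one cycle
    and the rest need long_path_cycles cycles."""
    if not 0.0 <= one_cycle_fraction <= 1.0:
        raise ValueError("one_cycle_fraction must lie between 0 and 1")
    avg_cycles = one_cycle_fraction + (1.0 - one_cycle_fraction) * long_path_cycles
    return cycle_period * avg_cycles


if __name__ == "__main__":
    fixed = 10.0                            # hypothetical critical path delay in ns
    variable = average_latency(6.0, 0.8)    # hypothetical 6 ns cycle, 80% short paths
    print(f"fixed latency: {fixed:.1f} ns, variable latency: {variable:.1f} ns")
```

The benefit disappears if long paths are activated too often: with the same hypothetical 6 ns cycle, the average latency exceeds 10 ns once fewer than one third of the patterns complete in a single cycle.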

D) RAZOR FLIP-FLOP

Razor flip-flops can be used to detect whether timing 
violations occur before the next input pattern arrives. A 1-bit 
Razor flip-flop contains a main flip-flop, shadow latch, XOR 
gate, and multiplexer. The main flip-flop captures the
execution result of the combinational circuit using the normal
clock signal, and the shadow latch captures the execution result
using a delayed clock signal, which lags the normal
clock signal. If the latched bit of the shadow latch is different
from that of the main flip-flop, this means the path delay of the 
current operation exceeds the cycle period, and the main 
flip-flop catches an incorrect result. If errors occur, the Razor 
flip-flop will set the error signal to 1 to notify the system to 
reexecute the operation and notify the AHL circuit that an 
error has occurred. 
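
The error detection of a 1-bit Razor flip-flop can be sketched with a simple timing model. The model assumes that the combinational output switches from an old value to a new value exactly at the path delay, and that the shadow latch samples a fixed interval after the main flip-flop; the function and parameter names are hypothetical.

```python
def razor_sample(path_delay, cycle_period, shadow_delay, old_value, new_value):
    """Return (main, shadow, error) for one bit of a Razor flip-flop.

    main   : value captured by the main flip-flop at cycle_period
    shadow : value captured by the shadow latch at cycle_period + shadow_delay
    error  : XOR of the two captured values
    """
    # The output is assumed to hold old_value until path_delay has elapsed.
    main = new_value if path_delay <= cycle_period else old_value
    shadow = new_value if path_delay <= cycle_period + shadow_delay else old_value
    error = main ^ shadow
    return main, shadow, error


if __name__ == "__main__":
    # Path finishes in time: both elements agree, no error.
    print(razor_sample(0.8, 1.0, 0.5, old_value=0, new_value=1))
    # Path is late for the main clock but in time for the delayed clock.
    print(razor_sample(1.2, 1.0, 0.5, old_value=0, new_value=1))
```

In this model a violation is only visible when the late result differs from the stale value and arrives before the delayed clock edge, which is why the delay of the shadow clock bounds the amount of timing degradation that can be detected.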

E) ADAPTIVE HOLD LOGIC

The adaptive hold logic (AHL) circuit is the key component of
the variable-latency multiplier. The AHL circuit contains a
decision block, a multiplexer, and a D flip-flop. If the cycle period is
too short, the column-bypassing multiplier is not able to 
complete these operations successfully, causing timing 
violations. These timing violations will be caught by the 
Razor flip-flops, which generate error signals. If errors 
happen frequently, it means the circuit has suffered significant 
timing degradation due to the aging effect. 



When an input pattern arrives, the decision block decides
whether the pattern requires one cycle or two cycles to
complete and passes both results to the multiplexer. The
multiplexer selects one of the results based on the output of
the Razor flip-flop. An OR operation is then performed
between the multiplexer output and the Q-bar signal, and the
result is used as the input of the D flip-flop. When the
pattern requires one cycle, the output of the multiplexer is 1.
The gating signal then becomes 1, and the input flip-flops
latch new data in the next cycle. On the other hand, when the
output of the multiplexer is 0, which means the input pattern
requires two cycles to complete, the OR gate outputs 0 to
the D flip-flop. Therefore, the gating signal will be 0 to
disable the clock signal of the input flip-flops in the next
cycle.
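
The gating behaviour described above can be sketched as a small cycle-by-cycle model. It assumes that Q-bar is the inverted output of the AHL D flip-flop, that the flip-flop starts at 1, and that the multiplexer output is reduced to a single flag per pattern; these are simplifying assumptions, and the function name is hypothetical.

```python
def ahl_cycles(one_cycle_flags):
    """Return the number of clock cycles each input pattern occupies.

    one_cycle_flags: iterable of booleans, True if the pattern is judged
    to need one cycle (multiplexer output 1), False if it needs two.
    """
    q = 1          # D flip-flop output (gating signal), assumed to start at 1
    cycles = []
    for needs_one_cycle in one_cycle_flags:
        used = 0
        while True:
            used += 1
            mux_out = 1 if needs_one_cycle else 0
            d = mux_out | (1 - q)   # OR of multiplexer output and Q-bar
            q = d                   # D flip-flop captures d at the clock edge
            if q == 1:              # gating = 1: input flip-flops load new data
                break
        cycles.append(used)
    return cycles


if __name__ == "__main__":
    print(ahl_cycles([True, False, True, False]))
```

Feeding Q-bar back into the OR gate guarantees that the gating signal is 0 for at most one cycle, so a two-cycle pattern holds the inputs for exactly one extra cycle before new data is accepted.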

V. VARIABLE LATENCY MULTIPLIER DESIGN WITH AHL

The aging-aware multiplier design includes two m-bit inputs
(m is a positive integer), one 2m-bit output, one
column-bypassing multiplier, 2m 1-bit Razor flip-flops, and
an AHL circuit. The clock is supplied through an AND gate at the
input.

When input patterns arrive, the column-bypassing
multiplier and the AHL circuit execute simultaneously.
Depending on the number of zeros in the multiplicand, the
AHL circuit decides the number of clock cycles required for the
current input pattern. If the input pattern requires two cycles
to complete, the AHL will output 0 to disable the clock signal
of the input flip-flops. Otherwise, the AHL will output 1 for
normal operations.

Fig.4. Variable latency multiplier with AHL

When the column bypassing multiplier completes the 
operation, the result will be passed to the Razor flip-flops. 
The Razor flip-flops check for any path delay timing 
violations. If timing violations occur, it means that the cycle 
period is not long enough for the current operation to 
complete and that the execution result of the multiplier is 
incorrect. Thus, the Razor flip-flops will output an error to 
inform the system that the current operation needs to be 
re-executed using two cycles to ensure the operation is 
correct. 
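
The overall flow of this section can be summarized in a short sketch. The zero-count threshold, the function names and the accounting of one wasted cycle plus two re-execution cycles after a Razor error are assumptions made for illustration; the paper only states that the decision depends on the number of zeros in the multiplicand and that a failed operation is re-executed using two cycles.

```python
def count_zeros(value, width):
    """Count the zero bits in the low `width` bits of value."""
    masked = value & ((1 << width) - 1)
    return width - bin(masked).count("1")


def cycles_for_pattern(multiplicand, width, zero_threshold, timing_error=False):
    """Return the clock cycles spent on one multiplication.

    zero_threshold : hypothetical AHL setting; patterns with at least this
                     many zeros in the multiplicand are predicted to need
                     one cycle, all others two.
    timing_error   : True if the Razor flip-flops flag a violation.
    """
    predicted = 1 if count_zeros(multiplicand, width) >= zero_threshold else 2
    if predicted == 1 and timing_error:
        # Assumed penalty: the failed one-cycle attempt plus a two-cycle re-execution.
        return 1 + 2
    return predicted


if __name__ == "__main__":
    for m in (0b10000001, 0b11110111):
        print(f"{m:08b}: {cycles_for_pattern(m, 8, zero_threshold=5)} cycle(s)")
```

Raising the threshold in this sketch makes the prediction more conservative: fewer patterns are attempted in one cycle, which lowers the chance of a re-execution as the circuit ages at the cost of a higher average cycle count.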


VI. APPLICATION : IMAGE PROCESSING 

In this section the application of the proposed multipliers to 
image processing is illustrated. A multiplier is used to 
multiply two images on a pixel by pixel basis, thus blending 
the two images into a single output image. 
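
Pixel-by-pixel multiplication can be sketched as follows for 8-bit grayscale images stored as nested lists. Scaling the 16-bit product back to the 8-bit range by dividing by 255 is an assumption made for this illustration; the paper does not state how the product is mapped to the output image.

```python
def multiply_images(img_a, img_b, max_value=255):
    """Blend two equally sized grayscale images by pixel-wise multiplication."""
    same_rows = len(img_a) == len(img_b)
    same_cols = all(len(ra) == len(rb) for ra, rb in zip(img_a, img_b))
    if not (same_rows and same_cols):
        raise ValueError("images must have the same dimensions")
    # Each product of two 8-bit pixels fits in 16 bits; rescale to 0..max_value.
    return [[(pa * pb) // max_value for pa, pb in zip(ra, rb)]
            for ra, rb in zip(img_a, img_b)]


if __name__ == "__main__":
    a = [[255, 128], [0, 64]]
    b = [[255, 255], [255, 128]]
    print(multiply_images(a, b))
```

Dark pixels contain many zero bits, so images with large dark regions are the kind of input for which a bypassing multiplier can disable many adders.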


VII. RESULT AND DISCUSSION 



Fig.5. Simulation result of 16*16 fixed latency

Fig.6. Simulation result of 16*16 variable latency

TABLE I
SIMULATION RESULT

Multiplier          Delay (ns)    Power (mW)    Area (gate count)
Fixed latency       6.199         129           12,826
Variable latency    3.076         49            8,199

The fixed-latency multiplier requires more clock cycles, which
increases area, power and delay. The variable-latency multiplier
uses fewer clock cycles and incurs fewer errors. As Table I
shows, its delay falls from 6.199 ns to 3.076 ns (about 50%),
its power from 129 mW to 49 mW (about 62%), and its gate count
from 12,826 to 8,199 (about 36%).

VIII. CONCLUSION

Variable-latency design minimizes the timing waste of the
noncritical paths. The multiplier is able to adjust the adaptive
hold logic to mitigate performance degradation due to delay
problems. The variable-latency design can adjust the number of
clock cycles required by each input pattern. Hence,
variable-latency multipliers suffer less performance degradation
than traditional fixed-latency multipliers, which must both
account for the degradation caused by the NBTI effect and
use the worst-case delay as the cycle period. The design can be
applied to image multiplication: two images can be multiplied
using the low-power variable-latency multiplier with AHL, with
reduced delay, area and power.